Abstract

Data replication and transaction deadlocks can severely affect
the performance of distributed database systems. Many current
evaluation techniques ignore these aspects because they are
difficult to evaluate through analysis and time-consuming to
evaluate through simulation. In this paper, we use a technique
that combines simulation and analysis to illustrate the impact
of deadlocks and to evaluate the performance of replicated
distributed databases with both shared and exclusive locks.

1. Introduction

A distributed database system (DDS) is a collection of
cooperating nodes, each containing a set of data items. A user
transaction can enter such a system at any of these nodes. The
receiving node, often referred to as the coordinating node,
undertakes the task of locating the nodes that contain the data
items required by the transaction.

In order to maintain database consistency and correctness
in the presence of concurrent transactions, several concurrency
control protocols have been proposed [1]. Of these, the most
commonly used are time-stamping and locking protocols. Locking
protocols have been widely used in both commercial and
research environments. In static locking, prior to the start of
execution, a transaction needs to acquire either a shared lock
(for read operations) or an exclusive lock (for update
operations) on each of the relevant data items.

Data replication is used to improve the performance of local
transactions and the availability of databases. In replicated
databases, one data item may have more than one copy in
the system. Replica control algorithms are used to maintain
consistency among these copies. One of these is the
read-one/write-all protocol. With this protocol, a request for
an exclusive lock must acquire an exclusive lock on every copy
of the data item. For a shared lock to succeed, any one copy of
the data item has to be share locked. When transactions with
conflicting lock requests are initiated concurrently, they may
block one another in a deadlock.
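
To make the deadlock condition concrete, the following is a minimal
sketch in Python. It assumes the system can be described by a wait-for
graph (each transaction mapped to the transactions whose locks it is
waiting on), so that a deadlock corresponds to a cycle. The function
name has_deadlock and the dictionary representation are illustrative
assumptions, not part of the protocols discussed in this paper.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle.

    waits_for maps a transaction id to the set of transaction ids
    it is currently waiting on (hypothetical representation).
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {}

    def visit(txn):
        color[txn] = GREY
        for other in waits_for.get(txn, ()):
            state = color.get(other, WHITE)
            if state == GREY:              # back edge: cycle found
                return True
            if state == WHITE and visit(other):
                return True
        color[txn] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t)
               for t in list(waits_for))


# T1 waits for T2 and T2 waits for T1: a two-transaction deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))
# T1 waits for T2, but T2 is not blocked: no deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": set()}))
```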

There are two major ways to evaluate the performance of
distributed systems: simulation and analysis. Simulation is a
conceptually tractable technique, but requires a large amount
of computation time. Analysis, on the other hand, is
computationally faster but may not be tractable for all
problems. In [4], Shyu and Li proposed an elegant analysis
model to evaluate the response time and throughput of
transactions in a non-replicated DDS. Assuming exclusive
locking (i.e., only write operations), they model the queue of
lock requests at an object as an M/M/1 queue [3]. This results
in a closed form for the waiting-time distribution at a node,
expressed in terms of the average arrival rate of requests and
the average lock-holding time. Once shared locks and
replication are added to the picture, it is very difficult to
obtain a closed-form model. Because of these limitations of
simulation and analysis, we develop a technique that combines
the two.
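
For reference, the standard M/M/1 results that such a closed form
builds on can be sketched as follows. This is only the textbook
single-queue formula, not the full model of [4]; the function name
mm1_wait and its parameter names are assumptions made for
illustration.

```python
import math


def mm1_wait(arrival_rate, mean_hold):
    """Textbook M/M/1 quantities for one lock queue.

    arrival_rate: lock requests per unit time at the object.
    mean_hold:    mean lock-holding time (mean service time).
    Returns (utilization, mean queueing delay, tail function), where
    tail(x) is P(queueing delay > x).
    """
    rho = arrival_rate * mean_hold
    if rho >= 1.0:
        raise ValueError("queue is unstable: utilization must be < 1")
    mean_wait = rho * mean_hold / (1.0 - rho)
    service_rate = 1.0 / mean_hold

    def tail(x):
        return rho * math.exp(-(service_rate - arrival_rate) * x)

    return rho, mean_wait, tail


rho, mean_wait, tail = mm1_wait(arrival_rate=0.5, mean_hold=1.0)
print(rho, mean_wait, tail(0.0))
```

The waiting time depends only on the request arrival rate and the
mean lock-holding time, which is why the closed form is expressed in
those two quantities.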

This paper is organized as follows. In Section 2, we describe
the model used in our performance evaluation. In Section 3, we
propose an evaluation technique. In Section 4, we illustrate
the results. Finally, Section 5 presents the conclusions.

2. Model 

Our model has the following parameters:

• There are n nodes. 

• There are d data items in a DDS. 

• Each data item is located at exactly c nodes. The dc data
copies are uniformly distributed across the n nodes.

• Each transaction accesses k data items. 

• r is the read ratio. Among the k data items to be accessed,
rk are accessed only for read operations, and the rest
are accessed for read-write operations. Due to the
read-one/write-all replica control policy, a transaction must
procure rk shared locks for the rk read operations and
(1 - r)kc exclusive locks for the (1 - r)k read-write
operations.

• Each data item is equally likely to be accessed by a trans- 
action. 

• Transaction arrivals into the system form a Poisson process
with rate λ.

• The communication delay between any two nodes is
exponentially distributed with mean t.

• The average execution time of a transaction, once the 
locks are obtained, is 3. 

• The deadlock detection mechanism is invoked every τ seconds.

• After a transaction is aborted, it takes an average of
ω seconds for the transaction to be restarted.

• μ is the service rate of transactions.

• b is the lock-holding time.

• λc is the arrival rate at each data copy.
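
As a quick check on the lock counts implied by the read-one/write-all
policy, consider the illustrative values k = 3, r = 2/3 and c = 2
(chosen here for the arithmetic only, not taken from the
experiments):

```latex
\begin{align*}
\text{shared locks}        &= rk = \tfrac{2}{3}\cdot 3 = 2,\\
\text{exclusive locks}     &= (1-r)kc = \tfrac{1}{3}\cdot 3\cdot 2 = 2,\\
\text{total lock requests} &= rk + (1-r)kc = 4.
\end{align*}
```

With c = 1 the total would be 3, so each read-write item costs c - 1
additional lock requests once it is replicated.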

3. Performance Evaluation Technique

Our technique consists of two stages. In the first stage, the
average transaction response time and throughput are
calculated while ignoring deadlocks. This is an iterative step
involving simulation and analysis. In the second stage, the
probabilities of transaction conflicts and deadlocks are
computed using probability models. These probabilities are
used, in turn, to compute the response time and throughput in
the presence of deadlocks.

Stage 1: 

Initially, we assume that there are no lock conflicts between
transactions. Each transaction has to procure rk shared locks
on data copies and (1 - r)kc exclusive locks on data copies.
Once a transaction has received all the lock grants from these
data objects, it can go ahead with execution.

This procedure is summarized in the following six steps.

1. Initialize the lock-holding time (b) to 1/μ.

2. Given the total transaction arrival rate (λ), the shared
lock ratio (r), the number of data items (d), the number of
data items required by each transaction (k), and the number
of replications (c), derive the arrival rate at each data
copy (λc).

3. With the arrival rate at each data copy (λc), the average
lock-holding time (b), and the transmission time (t), simulate
the queue at a data copy to obtain the wait-time (w)
distribution. With this distribution we can calculate the
response time of transactions.

4. With the average service time of transactions (1/μ) and
the transmission time, derive a new lock-holding time (b').

5. Set b to this new lock-holding time b'.

6. If the old and new lock-holding times are sufficiently
close, stop the iteration. Otherwise, repeat the iteration
with the updated b.

At the end of stage 1 the response time without the considera- 
tion of transaction deadlocks is obtained. 
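
The Stage 1 iteration can be sketched in Python as below. This is a
sketch under stated assumptions rather than the procedure used for
the results in this paper: the per-copy arrival rate (total lock
requests spread evenly over the dc copies) and the update rule for
the new lock-holding time (execution time plus two transmission
delays plus the mean wait) are both assumptions, since neither
formula is given in the text. The names simulate_mean_wait and
stage1 are hypothetical. A fixed random seed is used so that
successive iterations see the same random stream and the comparison
in step 6 is not dominated by simulation noise.

```python
import random


def simulate_mean_wait(arrival_rate, mean_hold, n_requests=20000, seed=0):
    """Estimate the mean queueing delay at one data copy.

    Single-server FIFO queue with Poisson arrivals and exponentially
    distributed lock-holding times, simulated with the Lindley recursion.
    """
    rng = random.Random(seed)   # fixed seed: same random stream every call
    wait = 0.0                  # queueing delay of the current request
    total = 0.0
    for _ in range(n_requests):
        total += wait
        hold = rng.expovariate(1.0 / mean_hold)   # this request's holding time
        gap = rng.expovariate(arrival_rate)       # time to the next arrival
        wait = max(0.0, wait + hold - gap)        # delay of the next request
    return total / n_requests


def stage1(lam, r, d, k, c, mu, t, tol=1e-4, max_iter=100):
    """Iterate steps 1-6 until the lock-holding time stabilises."""
    b = 1.0 / mu                                  # step 1
    locks_per_txn = r * k + (1 - r) * k * c
    lam_c = lam * locks_per_txn / (d * c)         # step 2 (ASSUMED formula)
    for _ in range(max_iter):
        w = simulate_mean_wait(lam_c, b)          # step 3
        b_new = 1.0 / mu + 2.0 * t + w            # step 4 (ASSUMED update rule)
        if abs(b_new - b) < tol:                  # step 6
            return b_new, w
        b = b_new                                 # step 5
    raise RuntimeError("lock-holding time did not converge")


b, w = stage1(lam=1.0, r=2 / 3, d=100, k=3, c=2, mu=1.0, t=0.1)
# ASSUMED response time: request delay + wait + grant delay + execution.
response_time = 0.1 + w + 0.1 + 1.0
print(b, w, response_time)
```

If the per-copy load approaches saturation, the simulated wait grows
without bound and the sketch raises an error instead of returning a
meaningless value.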

Stage 2: 

This stage considers transaction conflicts. The probabilities
of transaction deadlock and restart are computed, and these
are then used to compute the response time and throughput in
the presence of deadlocks.

Assume there are two transactions T1 and T2. Let RS and WS
denote the read set and write set of a transaction,
respectively.


1. Let $fs_i$ be the probability that the readset of T1 has i
data items overlapping with the writeset of T2, i.e.,
$|RS(T1) \cap WS(T2)| = i$.

2. Let $fe_{i,j}$ be the probability that, given
$|RS(T1) \cap WS(T2)| = i$, the writeset of T1 has j data items
overlapping with the readset and writeset of T2, i.e., the
probability that $|WS(T1) \cap (RS(T2) \cup WS(T2))| = j$.

[Equations (1) and (2), the closed-form expressions for $fs_i$
and $fe_{i,j}$, are not recoverable from the source.]


Note that $fs_i \, fe_{i,j}$ is the probability that
$|RS(T1) \cap WS(T2)| = i$ and
$|WS(T1) \cap (WS(T2) \cup RS(T2))| = j$.
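
One plausible way to compute overlap probabilities of this kind is
with hypergeometric counts, sketched below in Python. This is an
illustration only and is not claimed to match the authors'
expressions. It assumes that each transaction picks its items
uniformly at random from the d items, that rk and (1 - r)k are
integers (n_read and n_write), that a transaction's readset and
writeset are disjoint, and, as a simplification, that the only items
of T2 already excluded from T1's writeset are the i items counted in
the first overlap. The function names are hypothetical.

```python
from math import comb


def choose(n, m):
    """Binomial coefficient that is 0 outside the valid range."""
    return comb(n, m) if 0 <= m <= n else 0


def fs(i, d, n_read, n_write):
    """P(|RS(T1) & WS(T2)| = i): hypergeometric draw of T1's readset."""
    return (choose(n_write, i) * choose(d - n_write, n_read - i)
            / comb(d, n_read))


def fe(i, j, d, n_read, n_write):
    """P(|WS(T1) & (RS(T2) | WS(T2))| = j given the first overlap is i).

    ASSUMPTION: T1's writeset is drawn from the d - n_read items outside
    its own readset, of which k - i belong to T2.
    """
    k = n_read + n_write
    pool = d - n_read
    targets = k - i
    return (choose(targets, j) * choose(pool - targets, n_write - j)
            / comb(pool, n_write))


d, n_read, n_write = 20, 2, 1
print([fs(i, d, n_read, n_write) for i in range(n_write + 1)])
print([fe(0, j, d, n_read, n_write) for j in range(n_write + 1)])
```

Under these assumptions each family of probabilities sums to one,
which is a useful sanity check before the values are combined.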

If $PW_{i,j}$ is the probability that T1 waits for T2 given
these overlaps, then

$PW_{i,j} = p_1 + p_2 - p_1 p_2$    (3)

$p_1$ = [expression not recoverable from the source]    (4)

$p_2 = 1 - (1/2)^j$    (5)

where $p_1$ is the probability that T1 waits for T2 for shared
locks in the readset, and $p_2$ is the probability that T1 waits
for T2 for exclusive locks in the writeset.
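
The form of (3) is the usual inclusion-exclusion rule for the union
of two events. Assuming, for illustration, that the shared-lock wait
and the exclusive-lock wait are independent, it can equivalently be
written as a complement:

```latex
\begin{align*}
PW_{i,j} &= p_1 + p_2 - p_1 p_2 \\
         &= 1 - (1 - p_1)(1 - p_2).
\end{align*}
```

For example, the illustrative values $p_1 = 0.2$ and $p_2 = 0.5$ give
$0.2 + 0.5 - 0.1 = 0.6$, which equals $1 - 0.8 \cdot 0.5$: T1 avoids
waiting only if it avoids waiting on both kinds of lock.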

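The probability that T1 waits for T2 is now given by

$PW = \sum_{i} \sum_{j} fs_i \, fe_{i,j} \, PW_{i,j}$    (6)

where i and j each run from 0 to an upper limit of the form
min(...) that is illegible in the source.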

With this probability of waiting and the formulas in [4], we
can calculate the probability of a transaction deadlock, the
probability of a transaction restart, and the probability that
a transaction is blocked by other transactions. With these
probabilities and the time between deadlock detections (τ), we
can calculate the response time with deadlocks taken into
account. (Details are omitted here.)


4. Results 

Using this technique, we obtained a number of interesting
results that illustrate the effect of deadlocks and of the
number of replications on database performance. These are
summarized in Figures 1-5. We make the following observations.

• Transaction response times are quite sensitive to the ratio
of shared locks (Figures 1 and 2). Here, we compare the
response times when deadlocks are ignored (DI, computed in
Stage 1) with those obtained when deadlocks are considered
(DC, computed in Stage 2). The effect of deadlocks is more
pronounced at higher transaction loads and with smaller
values of r. When r = 2/3, the effect of deadlocks on
response time is not significant.

• Comparing Figures 1 and 2 with Figures 3 and 4, we observe
that an increase in replications results in a larger
response time when the read ratio is smaller than 1/3.

• Figure 5 shows the response times for different numbers of
replications. In both cases, with a read ratio of 2/3 and
of 1/3, the response time increases as the number of
replications increases. But when the read ratio equals 1/3,
the rate of increase is much smaller than when the read
ratio equals 2/3.




5. Conclusions 

In [4], Shyu and Li presented an elegant technique to evaluate
the performance of distributed database systems in the
presence of deadlocks. Their technique assumed only exclusive
locks and thus represents the worst-case effects of deadlocks.

Figure 1. Comparison of response time with different read
ratios when deadlock is ignored.

Figure 2. Comparison of response time with different read
ratios when deadlock is considered.

Figure 3. Comparison of response time with different read
ratios when deadlock is ignored.

Figure 4. Comparison of response time with different read
ratios when deadlock is ignored.

Figure 5. Comparison of response time with different read
ratios, with and without deadlock (x-axis: replication
number c). DC: Deadlock Considered. DI: Deadlock Ignored.



In this paper, we have extended their technique to combine
simulation and analysis. With this extended technique, the
model covers both shared and exclusive locking as well as
replication. We evaluated the effect of the number of data
items, the number of data items accessed by each transaction,
the ratio of read operations, and the number of replications
on transaction response time. These results show the
importance of considering both shared and exclusive lock
requests, the deadlock probabilities, and the number of
replications of the database in response time evaluations.